
Clustering of illustrations by atmosphere using a combination of supervised and unsupervised learning

Kubota, Keisuke, Okuda, Masahiro

arXiv.org Artificial Intelligence

The distribution of illustrations on social media, such as Twitter and Pixiv, has increased with the growing popularity of animation, games, and animated movies. The "atmosphere" of an illustration plays an important role in user preferences, so classifying illustrations by atmosphere can be helpful for recommendation and search. However, the elusive "atmosphere" is difficult to label clearly, making conventional supervised classification impractical. Furthermore, even images with similar colors, edges, and other low-level features may not share a similar atmosphere, making classification based on low-level features challenging. In this paper, this problem is addressed using both supervised and unsupervised learning with pseudo-labels: feature vectors are obtained by a supervised method using pseudo-labels that capture the ambiguous atmosphere, and clustering is then performed on these feature vectors. Experimental analyses show that our method outperforms conventional methods in human-like clustering on datasets manually classified by humans.


Asymptotics for The $k$-means

Zhang, Tonglin

arXiv.org Artificial Intelligence

Clustering is one of the most important unsupervised learning techniques for understanding underlying data structures. The goal is to partition a data set into subsets, called clusters, such that observations within a subset are the most homogeneous and observations between subsets are the most heterogeneous. Clustering is usually carried out by specifying a similarity or dissimilarity measure between observations. Examples include the k-means [17, 19, 29, 37], the k-medians [3], the k-modes [5], and the generalized k-means [2, 31, 45], as well as many of their modifications [21, 24, 42]. Among these, the k-means has been considered one of the most straightforward and popular methods since it was proposed sixty years ago [23, 36]. Although it is well known, the investigation of its theoretical properties still lags far behind, leading to difficulties in developing more precise k-means methods in practice. The goal of the present research is to propose a new concept, called clustering consistency, for the asymptotics of the k-means, with a resulting clustering method better than the existing k-means methods adopted by many software packages, including those in R and Python.
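The k-means procedure the abstract builds on can be sketched in a few lines. The following is a minimal illustration of Lloyd's algorithm on synthetic data (the function names and toy data are ours, not the paper's):

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Minimal Lloyd's algorithm: alternate nearest-center assignment
    and centroid updates until the centers stop moving."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # assign each observation to its nearest center
        labels = ((X[:, None] - centers) ** 2).sum(-1).argmin(axis=1)
        # move each center to the mean of its assigned observations
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return centers, labels

# two well-separated synthetic blobs
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.3, (50, 2)), rng.normal(5.0, 0.3, (50, 2))])
centers, labels = kmeans(X, k=2)
```

On data this cleanly separated, the algorithm recovers one center per blob; on harder data, results depend on initialization, which is one motivation for studying the method's asymptotic properties.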


Efficiency Evaluation of Banks with Many Branches using a Heuristic Framework and Dynamic Data Envelopment Optimization Approach: A Real Case Study

Kayvanfar, Vahid, Baziyad, Hamed, Sheikh, Shaya, Werner, Frank

arXiv.org Artificial Intelligence

Evaluating the efficiency of organizations, and of branches within an organization, is a challenging issue for managers. Evaluation criteria allow organizations to rank their internal units, identify their position relative to competitors, and implement strategies for improvement and development. Among the methods applied to the evaluation of bank branches, non-parametric methods have captured the attention of researchers in recent years. One of the most widely used non-parametric methods is data envelopment analysis (DEA), which leads to promising results. However, static DEA approaches do not consider time in the model. Therefore, this paper uses a dynamic DEA (DDEA) method to evaluate the branches of a private Iranian bank over three years (2017-2019). The results are then compared with static DEA. After ranking the branches, they are clustered using the K-means method. Finally, a comprehensive sensitivity analysis approach is introduced to help managers decide which variables to change in order to shift a branch from one cluster to a more efficient one.
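As a rough illustration of the final clustering step, the sketch below groups hypothetical efficiency scores into tiers with a small k-means implementation. The scores, the 12-branch setup, and the three-cluster choice are our assumptions for illustration, not figures from the study:

```python
import numpy as np

# hypothetical DEA efficiency scores for 12 branches (illustrative only)
scores = np.array([0.95, 0.92, 0.90, 0.74, 0.71, 0.70, 0.69,
                   0.45, 0.43, 0.41, 0.40, 0.38]).reshape(-1, 1)

def kmeans(X, k, iters=50):
    # farthest-first seeding: start from the first point and repeatedly
    # add the point farthest from the centers chosen so far, which avoids
    # empty clusters on small, well-separated data
    centers = [X[0]]
    for _ in range(k - 1):
        d = ((X[:, None] - np.array(centers)) ** 2).sum(-1).min(axis=1)
        centers.append(X[d.argmax()])
    centers = np.array(centers)
    for _ in range(iters):
        labels = ((X[:, None] - centers) ** 2).sum(-1).argmin(axis=1)
        centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    return labels

labels = kmeans(scores, k=3)  # e.g. high / medium / low efficiency tiers
```

The resulting labels split the branches into three efficiency tiers; a sensitivity analysis like the one in the paper would then ask which input or output variables a branch must change to move to a better tier.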


Aerodynamic Data Predictions Based on Multi-task Learning

Hu, Liwei, Xiang, Yu, Zhan, Jun, Shi, Zifang, Wang, Wenzheng

arXiv.org Artificial Intelligence

The quality of datasets is one of the key factors affecting the accuracy of aerodynamic data models. For example, in the uniformly sampled Burgers' dataset, the scarce high-speed data is overwhelmed by massive low-speed data. Predicting high-speed data is more difficult than predicting low-speed data because the number of high-speed samples is limited, i.e., the quality of the Burgers' dataset is not satisfactory. To improve dataset quality, traditional methods usually employ data resampling to produce enough data for the under-represented parts of the original dataset before modeling, which increases computational costs. Recently, mixtures of experts have been used in natural language processing to deal with different parts of sentences, which suggests a way to eliminate the need for data resampling in aerodynamic data modeling. Motivated by this, we propose multi-task learning (MTL), a dataset-quality-adaptive learning scheme that combines task allocation and aerodynamic characteristics learning to distribute the burden of the overall learning task. The task allocation divides the whole learning task into several independent subtasks, while the aerodynamic characteristics learning learns these subtasks simultaneously to achieve better precision. Two experiments with poor-quality datasets are conducted to verify the data-quality-adaptivity of the MTL. The results show that the MTL is more accurate than FCNs and GANs on poor-quality datasets.


Too Much Information Kills Information: A Clustering Perspective

Xu, Yicheng, Chau, Vincent, Wu, Chenchen, Zhang, Yong, Zissimopoulos, Vassilis, Zou, Yifei

arXiv.org Machine Learning

Clustering is one of the most fundamental tools in artificial intelligence, particularly in pattern recognition and learning theory. In this paper, we propose a simple but novel approach for variance-based k-clustering tasks, including the widely known k-means clustering. The proposed approach picks a sampling subset from the given dataset and makes decisions based on the data in that subset only. Under certain assumptions, the resulting clustering provably estimates the optimum of the variance-based objective well with high probability. Extensive experiments on synthetic and real-world datasets show that to obtain results competitive with the k-means method (Lloyd 1982) and the k-means++ method (Arthur and Vassilvitskii 2007), we only need 7% of the information in the dataset. If we have up to 15% of the information in the dataset, then our algorithm outperforms both the k-means and k-means++ methods in at least 80% of the clustering tasks, in terms of the quality of clustering. Also, an extended algorithm based on the same idea guarantees a balanced k-clustering result.
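The sampling idea can be illustrated with a simplified sketch (this is our own toy experiment, not the authors' algorithm or guarantees): cluster only a small random subset, then evaluate the resulting centers on the full dataset against centers fitted on all the data.

```python
import numpy as np

def farthest_first(X, k):
    # simple seeding: repeatedly add the point farthest from the
    # centers chosen so far
    centers = [X[0]]
    for _ in range(k - 1):
        d = ((X[:, None] - np.array(centers)) ** 2).sum(-1).min(axis=1)
        centers.append(X[d.argmax()])
    return np.array(centers)

def kmeans(X, k, iters=50):
    centers = farthest_first(X, k)
    for _ in range(iters):
        labels = ((X[:, None] - centers) ** 2).sum(-1).argmin(axis=1)
        centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    return centers

def cost(X, centers):
    # variance-based objective: total squared distance to nearest center
    return ((X[:, None] - centers) ** 2).sum(-1).min(axis=1).sum()

rng = np.random.default_rng(2)
X = np.vstack([rng.normal(c, 0.4, (200, 2)) for c in (0.0, 4.0, 8.0)])

# fit on a 10% random sample, then score those centers on the full data
sample = X[rng.choice(len(X), size=len(X) // 10, replace=False)]
ratio = cost(X, kmeans(sample, 3)) / cost(X, kmeans(X, 3))
```

On well-separated synthetic clusters like these, the subset-fitted centers achieve an objective value very close to the full-data fit, which is the intuition behind needing only a small fraction of the dataset's information.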


K-Means Clustering for Unsupervised Machine Learning

#artificialintelligence

Artificial Intelligence (AI) and Machine Learning (ML) have revolutionized every aspect of our lives and disrupted how we do business, unlike any other technology in the history of mankind. Such disruption brings many challenges for professionals and businesses. In this article, I will provide an introduction to one of the most commonly used machine learning methods, K-Means. Machine learning is a scientific method that utilizes statistical methods along with the computational power of machines to convert data into wisdom that humans or the machine itself can use to take certain actions. "It is a branch of artificial intelligence based on the idea that systems can learn from data, identify patterns and make decisions with minimal human intervention."
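A minimal usage example of K-Means in practice, using scikit-learn's `KMeans` on toy data of our own (the group positions and sizes are arbitrary):

```python
import numpy as np
from sklearn.cluster import KMeans

# two synthetic groups of 2-D points
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-3.0, 0.5, (40, 2)),
               rng.normal(3.0, 0.5, (40, 2))])

# fit k-means with k=2; n_init restarts guard against poor initializations
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

print(km.cluster_centers_)  # one center near (-3, -3), one near (3, 3)
print(km.inertia_)          # within-cluster sum of squared distances
```

`labels_` gives each point's cluster index, and `predict` assigns new points to the nearest learned center; choosing the number of clusters k is up to the user and is often guided by how `inertia_` changes as k grows.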